-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop using null terminators on JSON (de)serialization. #5353
Conversation
// Trim null terminator if exists. | ||
if (!buffer.empty() && buffer.back() == '\0') { | ||
buffer = buffer.first(buffer.size() - 1); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought this might not be needed, considering that the null terminator gets trimmed on the server side.
::capnp::MallocMessageBuilder message_builder; | ||
auto builder = | ||
message_builder.initRoot<capnp::LoadArraySchemaResponse>(); | ||
json.decode(kj::StringPtr(data.data()), builder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can validate that I did not miss a case by seeing there are zero occurences of kj::StringPtr
in the codebase.
CI is green, apart from an unrelated failure that is expected to be fixed in #5355. This is ready for review. |
…relying on the null termination.
TileDB-Go removes it either way.
We were missing a header, and we can pass a `kj::ArrayPtr` which does not require being null-terminated.
1878cb1
to
98adaf4
Compare
Question before I review this PR , what is the motivation for this change? Do you expect some performance improvement or some other sort of benefit from this refactoring? |
@ypatia the motivation is to unify the diverging logic that depends on the format being capnp or JSON and have code always just send the whole buffer to the client. This will both simplify the code and enable some (likely significant) optimizations outlined in SC-58009. |
As discussed yesterday in planning, it'd be good to test this a bit more thoroughly just in case we are missing something that could be potentially breaking. I don't have context on why this null terminator needed to be added in the past, so I don't feel comfortable removing it just like this. Let's at least run REST-CI with JSON as default mode for this PR, @shaunrd0 said he can help with that if needed. Then we could also consider some testing using the JS client that is using JSON as serialization type, @SarantopoulosKon could help with instructions if we decide to go that way. |
Some REST-CI tests failed with |
I believe this might indeed cause issues, as in some places it was assumed there would be a null terminator. i.e https://github.com/TileDB-Inc/TileDB-Go/blob/9ec3a9462135bf6005dcc06a1f44a6dc24afecba/buffer.go#L107 |
The attached snippet does not assume the existence of a null terminator. If there isn't one, |
I found the cause of the segfault: TileDB/test/src/unit-capi-array.cc Lines 1001 to 1002 in 55c5d95
TileDB/test/src/unit-capi-array.cc Lines 1011 to 1018 in 55c5d95
TileDB/test/src/unit-capi-array.cc Line 1151 in 55c5d95
The array writes failed because serializing queries in JSON is not supported but we continued execution because of the @SarantopoulosKon let's sync tomorrow on how to run the Cloud-JS tests with this PR. |
We can't read or write itbecause queries do not support JSON serialization.
This reverts commit c3c4fd3.
I added a REST CI test that creates an array using JSON serialization. Let me know if this would be sufficient for validating this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this change and for you effort to test it. If all goes well, it will enable an important optimization server side that we believe it can bring both memory and latency improvements.
CI is green, merging… |
[SC-58051](https://app.shortcut.com/tiledb-inc/story/58051/stop-using-null-terminators-on-json-de-serialization) This PR updates the JSON deserialization code to explicitly pass the message's length to capnproto, instead of appending a null terminator and relying on it to get the length. Correspondingly, it removes the null terminator in serialized JSON messages, which is technically a behavior breaking change but the null terminator gets already removed by both [the Go API](https://github.com/TileDB-Inc/TileDB-Go/blob/9ec3a9462135bf6005dcc06a1f44a6dc24afecba/buffer.go#L105-L107) and the REST server (see story), and the serialization APIs are documented as being for internal use. --- TYPE: NO_HISTORY
SC-58051
This PR updates the JSON deserialization code to explicitly pass the message's length to capnproto, instead of appending a null terminator and relying on it to get the length. Correspondingly, it removes the null terminator in serialized JSON messages, which is technically a behavior breaking change but the null terminator gets already removed by both the Go API and the REST server (see story), and the serialization APIs are documented as being for internal use.
TYPE: NO_HISTORY